In this paper, we construct two automatic evaluation metrics for assessing the association between machine-generated captions and the ground-truth style: overtyle and StyleD.
Many past attempts at modeling repeated Cournot games assume that demand is stationary. This does not align with real-world scenarios, in which market demand can change over a product's lifetime for any number of reasons. In this paper, we model repeated Cournot games with non-stationary demand, such that firms/agents face separate instances of a non-stationary multi-armed bandit problem. The arms/actions that an agent can choose from represent discrete production quantities; the action space is therefore ordered. The agents are independent and autonomous and cannot observe anything in the environment; they only see their own rewards after taking an action and work solely toward maximizing those rewards. We propose a novel algorithm, 'Adaptive with Weighted Exploration (AWE) $\epsilon$-greedy', which is loosely based on the well-known $\epsilon$-greedy approach. The algorithm detects and quantifies changes in rewards caused by varying market demand and adjusts the degree of exploration in proportion to the extent of that change, enabling agents to better identify the new optimal actions. For efficient exploration, it also deploys a mechanism for weighting actions that exploits the ordered action space. We use simulations to study the emergence of various equilibria in the market. In addition, we study the scalability of our approach with respect to the total number of agents in the system and the size of the action space. We consider both symmetric and asymmetric firms in our models. We find that, with our proposed method, agents are able to swiftly change their course of action in response to changes in demand, and they also engage in collusive behavior in many of the simulations.
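To make the above concrete, here is a minimal sketch of a change-aware $\epsilon$-greedy agent over an ordered action space. The change-detection rule, the exploration weighting, and all parameter names below are simplified stand-ins rather than the actual AWE $\epsilon$-greedy algorithm, which the paper defines precisely.

```python
import numpy as np

class ChangeAwareEpsilonGreedy:
    """Illustrative non-stationary epsilon-greedy agent over an ordered action space.

    NOT the paper's AWE epsilon-greedy: the change-detection and exploration-weighting
    rules below are simplified stand-ins for the mechanisms described in the abstract.
    """

    def __init__(self, n_actions, eps_min=0.05, eps_max=0.5, decay=0.9):
        self.n_actions = n_actions
        self.eps_min, self.eps_max = eps_min, eps_max
        self.decay = decay                       # recency weighting of reward estimates
        self.q = np.zeros(n_actions)             # recency-weighted reward estimates
        self.prev_q = np.zeros(n_actions)        # snapshot used to quantify change
        self.eps = eps_max

    def select_action(self, rng):
        if rng.random() < self.eps:
            # Weight exploration toward arms near the current greedy arm,
            # exploiting the ordering of production quantities.
            best = int(np.argmax(self.q))
            dist = np.abs(np.arange(self.n_actions) - best)
            weights = 1.0 / (1.0 + dist)
            return int(rng.choice(self.n_actions, p=weights / weights.sum()))
        return int(np.argmax(self.q))

    def update(self, action, reward):
        self.q[action] = self.decay * self.q[action] + (1 - self.decay) * reward
        # Crude change detection: the bigger the shift in estimates since the last
        # step, the more the agent explores on the next step.
        change = float(np.abs(self.q - self.prev_q).mean())
        self.eps = float(np.clip(change, self.eps_min, self.eps_max))
        self.prev_q = self.q.copy()

rng = np.random.default_rng(0)
agent = ChangeAwareEpsilonGreedy(n_actions=10)
a = agent.select_action(rng)
agent.update(a, reward=1.3)      # the reward would come from the market simulation
```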
We investigate a multi-agent multi-armed bandit (MA-MAB) setting for modeling repeated Cournot oligopoly games, in which the firms acting as agents choose from arms representing production quantities (discrete values). Each agent interacts with its own separate and independent bandit problem. In this formulation, each agent makes sequential choices among the arms to maximize its own reward. The agents have no information about the environment; they only see their own rewards after taking an action. The market demand, however, is a stationary function of total industry output, and random entry into or exit from the market is not allowed. Given these assumptions, we find that the $\epsilon$-greedy approach offers a more viable learning mechanism than other traditional MAB approaches, as it does not require any additional knowledge of the system in order to operate. We also propose two novel approaches that take advantage of the ordered action space: $\epsilon$-greedy+HL and $\epsilon$-greedy+EL. These methods help firms focus on more profitable actions by eliminating less profitable choices, and are thereby designed to optimize exploration. We use computer simulations to study the emergence of various equilibria in the outcomes and conduct an empirical analysis of the joint cumulative regret.
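The elimination idea can be sketched as follows: periodically drop the least profitable arms and restrict both exploration and exploitation to the survivors. The pruning schedule and fraction below are hypothetical; the paper's $\epsilon$-greedy+HL and $\epsilon$-greedy+EL variants define their own elimination rules.

```python
import numpy as np

def eliminate_and_choose(q_values, active, eps, step, rng,
                         eliminate_every=100, frac=0.25):
    """Illustrative elimination-style epsilon-greedy step over an ordered action space.

    q_values: reward estimates per arm; active: boolean mask of arms still in play.
    A generic stand-in for the paper's epsilon-greedy+HL / epsilon-greedy+EL variants,
    whose exact elimination rules are not reproduced here.
    """
    if step > 0 and step % eliminate_every == 0 and active.sum() > 1:
        # Drop the least profitable fraction of the still-active arms.
        idx = np.flatnonzero(active)
        k = min(max(1, int(frac * idx.size)), idx.size - 1)
        worst = idx[np.argsort(q_values[idx])[:k]]
        active[worst] = False
    idx = np.flatnonzero(active)
    if rng.random() < eps:
        return int(rng.choice(idx))              # explore only among surviving arms
    return int(idx[np.argmax(q_values[idx])])    # exploit the best surviving arm

rng = np.random.default_rng(1)
q = rng.normal(size=20)
active = np.ones(20, dtype=bool)
action = eliminate_and_choose(q, active, eps=0.1, step=100, rng=rng)
```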
In this paper, we propose to build a fashion image captioning model via a multi-style, multi-modal mechanism (2M). We demonstrate that, using 2M, we can build effective fashion captioners, and that the multiple references the model produces can also support explaining the model by identifying the erroneous input features behind mispredicted examples. We show how this 2M mechanism can be used to build fashion captioning models, and how these models can be used to provide explanations of likely errors in the model.
In the real world, people/entities usually find matches independently and autonomously, e.g., when looking for jobs, partners, roommates, etc. Such a search may begin with no initial knowledge of the environment. We propose the use of a multi-agent reinforcement learning (MARL) paradigm for a spatially formulated, decentralized two-sided matching market with independent and autonomous agents. Having autonomous agents acting independently makes our environment highly dynamic and uncertain. Moreover, the agents lack knowledge of other agents' preferences and must explore the environment and interact with one another to discover their own preferences through noisy rewards. We argue that such a setting better approximates the real world, and we study the usefulness of our MARL approach for it. Along with the conventional stable matching case, in which agents have strictly ordered preferences, we examine the applicability of our approach to stable matching with incomplete lists and ties. We investigate our results in terms of stability, the level of instability (for unstable outcomes), and fairness. Our MARL approach mostly yields stable and fair outcomes.
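For reference, stability in this setting is typically evaluated by counting blocking pairs, i.e., pairs of agents who would both prefer each other over their current partners. A generic check, assuming preference dictionaries that map partners to ranks and omit unacceptable partners (which covers incomplete lists), might look like this; it is not the authors' code.

```python
def blocking_pairs(matching, prefs_a, prefs_b):
    """Find blocking pairs in a two-sided matching.

    matching: dict mapping each side-A agent to its side-B partner (or None).
    prefs_a[a] / prefs_b[b]: dict partner -> rank (lower is better); a partner
    missing from the dict is unacceptable, which covers incomplete lists.
    """
    partner_of_b = {b: a for a, b in matching.items() if b is not None}
    pairs = []
    for a, ranks in prefs_a.items():
        current_a = matching.get(a)
        for b, rank_ab in ranks.items():
            current_b = partner_of_b.get(b)
            a_prefers = current_a is None or rank_ab < ranks.get(current_a, float("inf"))
            b_prefers = current_b is None or (
                prefs_b[b].get(a, float("inf")) < prefs_b[b].get(current_b, float("inf"))
            )
            if a_prefers and b_prefers:
                pairs.append((a, b))
    return pairs

prefs_a = {"a1": {"b1": 0, "b2": 1}, "a2": {"b1": 0}}
prefs_b = {"b1": {"a1": 0, "a2": 1}, "b2": {"a1": 0}}
print(blocking_pairs({"a1": "b2", "a2": "b1"}, prefs_a, prefs_b))  # -> [('a1', 'b1')]
```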
Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgency to provide tools for effective wildfire management. Early detection of wildfires is essential to minimizing potentially catastrophic destruction. In this paper, we present our work on integrating multiple data sources in SmokeyNet, a deep learning model using spatio-temporal information to detect smoke from wildland fires. Camera image data is integrated with weather sensor measurements and processed by SmokeyNet to create a multimodal wildland fire smoke detection system. We present our results comparing performance in terms of both accuracy and time-to-detection for multimodal data vs. a single data source. With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.
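As an illustration of the multimodal idea, a generic late-fusion baseline concatenates image features with encoded weather-sensor readings before a shared classification head. The layer sizes and the tiny image encoder below are placeholders; SmokeyNet's actual spatio-temporal backbone and fusion strategy are described in the paper.

```python
import torch
import torch.nn as nn

class LateFusionSmokeClassifier(nn.Module):
    """Generic late-fusion baseline: image features + weather-sensor readings.

    Only an illustration of the multimodal idea; SmokeyNet's spatio-temporal
    image backbone and fusion strategy differ and are defined in the paper.
    """

    def __init__(self, img_feat_dim=128, n_sensor_features=8, hidden=64):
        super().__init__()
        self.img_encoder = nn.Sequential(           # tiny stand-in image backbone
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, img_feat_dim), nn.ReLU(),
        )
        self.sensor_encoder = nn.Sequential(nn.Linear(n_sensor_features, hidden), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(img_feat_dim + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),                    # smoke / no-smoke logit
        )

    def forward(self, image, sensors):
        fused = torch.cat([self.img_encoder(image), self.sensor_encoder(sensors)], dim=-1)
        return self.head(fused)

model = LateFusionSmokeClassifier()
logit = model(torch.randn(2, 3, 224, 224), torch.randn(2, 8))   # -> shape (2, 1)
```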
Applying deep learning concepts from image detection and graph theory has greatly advanced protein-ligand binding affinity prediction, a challenge with enormous ramifications for both drug discovery and protein engineering. We build upon these advances by designing a novel deep learning architecture consisting of a 3-dimensional convolutional neural network utilizing channel-wise attention and two graph convolutional networks utilizing attention-based aggregation of node features. HAC-Net (Hybrid Attention-Based Convolutional Neural Network) obtains state-of-the-art results on the PDBbind v.2016 core set, the most widely recognized benchmark in the field. We extensively assess the generalizability of our model using multiple train-test splits, each of which maximizes differences between either protein structures, protein sequences, or ligand extended-connectivity fingerprints. Furthermore, we perform 10-fold cross-validation with a similarity cutoff between SMILES strings of ligands in the training and test sets, and also evaluate the performance of HAC-Net on lower-quality data. We envision that this model can be extended to a broad range of supervised learning problems related to structure-based biomolecular property prediction. All of our software is available as open source at https://github.com/gregory-kyro/HAC-Net/.
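Channel-wise attention of the squeeze-and-excitation kind, applied to 3D feature maps, can be sketched as follows. This is only an illustration of the mechanism named above; HAC-Net's exact attention blocks may differ, and the reference implementation is in the linked repository.

```python
import torch
import torch.nn as nn

class ChannelAttention3D(nn.Module):
    """Squeeze-and-excitation style channel-wise attention for 3D feature maps.

    Illustrates the kind of channel-wise attention used in a 3D-CNN branch;
    see the open-source HAC-Net repository for the actual blocks.
    """

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (batch, channels, D, H, W)
        w = x.mean(dim=(2, 3, 4))            # squeeze: global average pool per channel
        w = self.fc(w).view(x.size(0), x.size(1), 1, 1, 1)
        return x * w                         # excite: reweight channels

att = ChannelAttention3D(channels=32)
out = att(torch.randn(4, 32, 24, 24, 24))   # same shape as the input
```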
We propose AnyTOD, an end-to-end task-oriented dialog (TOD) system with zero-shot capability for unseen tasks. We view TOD as a program executed by a language model (LM), where program logic and ontology is provided by a designer in the form of a schema. To enable generalization onto unseen schemas and programs without prior training, AnyTOD adopts a neuro-symbolic approach. A neural LM keeps track of events that occur during a conversation, and a symbolic program implementing the dialog policy is executed to recommend next actions AnyTOD should take. This approach drastically reduces data annotation and model training requirements, addressing a long-standing challenge in TOD research: rapidly adapting a TOD system to unseen tasks and domains. We demonstrate state-of-the-art results on the STAR and ABCD benchmarks, as well as AnyTOD's strong zero-shot transfer capability in low-resource settings. In addition, we release STARv2, an updated version of the STAR dataset with richer data annotations, for benchmarking zero-shot end-to-end TOD models.
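The neuro-symbolic split can be pictured with a toy policy program: a neural LM flags events as the conversation unfolds, and a symbolic routine recommends the next actions whose schema-defined preconditions are met. The schema and event names below are hypothetical and far simpler than AnyTOD's actual programs.

```python
def recommend_next_actions(observed_events, schema):
    """Toy symbolic policy: recommend actions whose preconditions are met.

    observed_events: set of event names a neural LM has flagged so far.
    schema: maps each action to the events it requires and the event that marks
    it as already satisfied. A schematic stand-in for AnyTOD's policy programs.
    """
    return [
        action for action, rule in schema.items()
        if rule["requires"] <= observed_events
        and rule["satisfied_by"] not in observed_events
    ]

# Hypothetical schema for a simple booking task.
schema = {
    "ask_date":        {"requires": set(),              "satisfied_by": "user_gave_date"},
    "ask_party_size":  {"requires": {"user_gave_date"}, "satisfied_by": "user_gave_party_size"},
    "confirm_booking": {"requires": {"user_gave_date", "user_gave_party_size"},
                        "satisfied_by": "booking_confirmed"},
}
print(recommend_next_actions({"user_gave_date"}, schema))   # -> ['ask_party_size']
```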
We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility-on-demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator's otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state-of-the-art benchmarks with respect to performance, stability, and computational tractability.
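The coordination step can be illustrated with a standard weighted bipartite matching: given per-vehicle scores for each open request (in the paper these come from the learned critics), a Hungarian-style solver produces a globally consistent assignment, and low-value requests can be rejected. The score matrix and rejection rule here are illustrative stand-ins, not the paper's policy.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_requests(scores, rejection_threshold=0.0):
    """Globally coordinated assignment from per-vehicle request scores.

    scores[i, j]: value of assigning request j to vehicle i (in the paper these
    values come from the multi-agent Soft Actor-Critic); the rejection rule
    below is an illustrative stand-in.
    """
    rows, cols = linear_sum_assignment(scores, maximize=True)
    assignments = {int(j): int(i) for i, j in zip(rows, cols)
                   if scores[i, j] >= rejection_threshold}
    rejected = [j for j in range(scores.shape[1]) if j not in assignments]
    return assignments, rejected

scores = np.array([[1.2, -0.3, 0.8],
                   [0.4,  0.9, -1.0]])
print(assign_requests(scores))    # -> ({0: 0, 1: 1}, [2])
```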
Modern machine learning requires system designers to specify aspects of the learning pipeline, such as losses, architectures, and optimizers. Meta-learning, or learning-to-learn, instead aims to learn those aspects, and promises to unlock greater capabilities with less manual effort. One particularly ambitious goal of meta-learning is to train general-purpose in-context learning algorithms from scratch, using only black-box models with minimal inductive bias. Such a model takes in training data, and produces test-set predictions across a wide range of problems, without any explicit definition of an inference model, training loss, or optimization algorithm. In this paper we show that Transformers and other black-box models can be meta-trained to act as general-purpose in-context learners. We characterize phase transitions between algorithms that generalize, algorithms that memorize, and algorithms that fail to meta-train at all, induced by changes in model size, number of tasks, and meta-optimization. We further show that the capabilities of meta-trained algorithms are bottlenecked by the accessible state size (memory) determining the next prediction, unlike standard models which are thought to be bottlenecked by parameter count. Finally, we propose practical interventions such as biasing the training distribution that improve the meta-training and meta-generalization of general-purpose learning algorithms.
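A minimal form of the black-box setup is a sequence model that consumes (x, y) training pairs followed by a query x and emits a prediction for the query, with no explicit inference model, training loss, or optimizer baked into the architecture. The regression framing and dimensions below are illustrative, not the paper's meta-training configuration.

```python
import torch
import torch.nn as nn

class InContextLearner(nn.Module):
    """Minimal black-box in-context learner: a Transformer encoder that maps a
    sequence of (x, y) training pairs plus a query x to a prediction for the query.

    Dimensions and the regression setup are illustrative placeholders.
    """

    def __init__(self, x_dim=8, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(x_dim + 1, d_model)     # concat x with y (0 for the query)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.readout = nn.Linear(d_model, 1)

    def forward(self, xs, ys, x_query):
        # xs: (B, N, x_dim), ys: (B, N, 1), x_query: (B, 1, x_dim)
        context = torch.cat([xs, ys], dim=-1)
        query = torch.cat([x_query, torch.zeros_like(x_query[..., :1])], dim=-1)
        tokens = self.embed(torch.cat([context, query], dim=1))
        h = self.encoder(tokens)
        return self.readout(h[:, -1])                  # prediction for the query token

model = InContextLearner()
xs, ys = torch.randn(2, 16, 8), torch.randn(2, 16, 1)
pred = model(xs, ys, torch.randn(2, 1, 8))             # -> shape (2, 1)
```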